(57) Abstract



A Wallace-type binary tree multiplier (Fig. 3) in which the partial products (Fig. 2) of a multiplicand and a multiplier are produced
and then successively reduced using a plurality of adder levels (L1, L2, L3, L4, Fig. 3) comprised of full and half adders (FA, HA, Fig.
3). This reduction continues until a final set of inputs (Level L4, Fig. 3) is produced wherein no more than two inputs remain to be added
in any column. This final set is then added using a serial adder (20) and a carry lookahead adder (21) to produce the desired product
(p0-p15). The additions at each level are performed in accordance with prescribed rules to provide for fastest overall operating speed and
minimum required chip area. In addition, the lengths of the serial adder (20) and carry lookahead adder (21) are chosen to further enhance
speed while reducing required chip area. A still further enhancement in multiplier operating speed is achieved by providing connections to
adders (Fig. 3) so as to take advantage of the different times of arrival of the inputs to each level (levels L1, L2, L3, L4 in Fig. 3) along
with different adder input-to-output delays.

ENHANCED FAST MULTIPLIER



BACKGROUND OF THE INVENTION 
This invention relates generally to improved means and
methods for performing arithmetic operations in a data
processing system, and more particularly to an improved
high speed binary multiplier provided on an integrated
circuit chip.

In designing a high speed binary multiplier on an
integrated circuit chip, two important considerations are
operating speed and required chip area. Most multiplier
designs attempt to make some trade-off between the two,
particularly where available chip area is limited, as is
usually the case.

SUMMARY OF THE INVENTION
The primary object of the present invention is to
provide improved means and methods for increasing
multiplier operating speed while minimizing required
chip area.

In a particular preferred embodiment of the invention,
a Wallace-type binary tree multiplier is provided in which
the partial products of a multiplicand A and a multiplier
B are produced and then reduced by successive addition
using a plurality of adder levels comprised of full and
half adders. This reduction continues until a final set of
inputs is produced having no more than two inputs remaining
to be added in any column. This final set is then added
using two side-by-side final adders to produce the final
product. In this preferred embodiment, the particular
inputs to be added by the full and half adders at each
level are selected in accordance with prescribed rules to
provide for fastest overall operating speed and minimum
required chip area. In addition, the side-by-side final
adders are chosen of prescribed type and length to take
advantage of the earlier arrival times of least significant
bits as compared to bits of higher significance.

A still further enhancement in multiplier operating
speed is achieved by taking advantage of the different
times of arrival of the inputs to each level along with the
fact that delays through the adder are typically different
for different adder inputs and outputs.

The specific nature of the invention as well as other
objects, features, advantages and uses thereof will become
evident from the following detailed description of a
particular preferred embodiment taken in conjunction with
the accompanying drawings.

BRIEF DESCRIPTION OF THE DRAWINGS
Fig. 1 illustrates the well known pencil-and-paper
method of multiplication applied to binary multiplication.

Fig. 2 illustrates the method of Fig. 1 applied to the
multiplication of an 8-bit multiplicand by an 8-bit
multiplier.

Fig. 3 is a block diagram illustrating a binary
multiplier circuit in accordance with the invention.

Fig. 4 is a block diagram illustrating a full adder FA
which may be employed in the preferred binary multiplier
circuit of Fig. 3.

Fig. 5 is a block diagram illustrating a half adder HA
which may be employed in the preferred binary multiplier
circuit of Fig. 3.

Figs. 6-10 schematically illustrate how the levels of
Fig. 3 provide for progressively reducing the partial
products in Fig. 2 to produce two final rows of inputs for
application to a final adder stage.

Fig. 11 illustrates a preferred embodiment of the 
serial adder 20 in Fig. 3. 

Fig. 12 illustrates a modification of the connections
to the adders in column 8 of level L2 for enhanced addition
speed.

DESCRIPTION OF A PREFERRED EMBODIMENT 
Like numerals and characters refer to like elements 
throughout the figures of the drawings. 

Initially, reference is directed to Fig. 1, which
illustrates the well known paper-and-pencil method applied
to the multiplication of a 4-bit binary multiplicand A=1100
(decimal 12) by a 4-bit binary multiplier B=1101 (decimal
13) to produce an 8-bit product P=10011100 (decimal 156).
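
By way of illustration only, the paper-and-pencil method can be sketched in a few lines of Python. The function name `shift_and_add` and the `bits` parameter are hypothetical names chosen for this sketch and do not appear in the drawings; the operand values are those given above for Fig. 1.

```python
def shift_and_add(a: int, b: int, bits: int = 4) -> int:
    """Multiply two unsigned integers the paper-and-pencil way."""
    product = 0
    for j in range(bits):
        if (b >> j) & 1:          # multiplier bit b_j selects a row
            product += a << j     # the row is the multiplicand shifted left j places
    return product


A, B = 0b1100, 0b1101             # decimal 12 and 13, as in Fig. 1
P = shift_and_add(A, B)
print(format(P, "08b"), P)        # 10011100 156
```

Each pass through the loop corresponds to one row of the written multiplication; a row is all zeros when the multiplier bit is 0 and a shifted copy of the multiplicand when it is 1.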

Fig. 2 illustrates the well known paper-and-pencil
method of Fig. 1 applied to the multiplication of an
8-bit multiplicand A=a7a6a5a4a3a2a1a0 by an 8-bit
multiplier B=b7b6b5b4b3b2b1b0 to produce a 16-bit product
P=p15p14p13p12p11p10p9p8p7p6p5p4p3p2p1p0 obtained by adding
the "ab" partial products in each column.

Fig. 3 is a preferred embodiment of a binary
multiplier circuit which illustrates how the present
invention may preferably be implemented for performing the
multiplication exemplified in Fig. 2.

As shown in Fig. 3, the 8-bit multiplicand
A=a7a6a5a4a3a2a1a0 and the 8-bit multiplier B=b7b6b5b4b3b2b1b0 are
applied to an initial multiplier 10 for producing the
sixty-four partial products a0b0, a0b1, ..., a7b7 shown in
Fig. 2. Note that each partial product is in a respective
column wherein each column corresponds to a respective
product bit. These partial products are applied to a
plurality of adder levels L1 to L4 comprised of full adders
FA and half adders HA for successively reducing the number
of column inputs applied to each level until level L4 is
reached, which produces a final set of column inputs wherein
no more than two inputs remain to be added in any column.
These remaining inputs are then applied to side-by-side final
adders 20 and 21 to produce the resulting product
P=p15p14p13p12p11p10p9p8p7p6p5p4p3p2p1p0.
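
As a rough sketch of what the initial multiplier 10 produces, the following Python fragment forms each partial product as the AND of one multiplicand bit and one multiplier bit and files it under column i + j. The function name `partial_product_columns` and the two operand values are hypothetical and were chosen only for this illustration.

```python
def partial_product_columns(a: int, b: int, n: int = 8, m: int = 8):
    """Return columns[k] = list of partial-product bits a_i AND b_j with i + j == k."""
    columns = [[] for _ in range(n + m)]
    for i in range(n):
        for j in range(m):
            bit = ((a >> i) & 1) & ((b >> j) & 1)
            columns[i + j].append(bit)
    return columns


a, b = 0b10110101, 0b01101110     # arbitrary 8-bit operands for the sketch
cols = partial_product_columns(a, b)
heights = [len(c) for c in cols]
print(heights)                    # [1, 2, 3, 4, 5, 6, 7, 8, 7, 6, 5, 4, 3, 2, 1, 0]

# Adding every column with its binary weight recovers the product.
value = sum(sum(c) << k for k, c in enumerate(cols))
print(value == a * b)             # True
```

Column 15 receives no partial product; in this sketch it stays empty and would be filled only by a carry out of the final addition.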

Those skilled in the art will recognize that the
preferred multiplier circuit illustrated in Fig. 3 is of
the general Wallace type described in "A Suggestion for a
Fast Multiplier", C.S. Wallace, Vol. 13, No. 14, IEEE
Transactions on Electronic Computers (Feb. 1964), pp.
14-17. The present invention enhances the basic Wallace
approach in a manner not previously known or taught in the
art in order to achieve a significantly better combination
of operating speed and required chip area. The manner in
which these enhancements are incorporated in the preferred
embodiment of Fig. 3 will next be described.

First to be considered is the manner in which the full
adders FA and half adders HA in the preferred multiplier
circuit of Fig. 3 are chosen for reducing the sums output
from each level. The basic rules used in Fig. 3 are as
follows. For each column at each level, a full adder FA is
connected to three-input groups until fewer than three
inputs remain in the column at that level. If only two
inputs remain in a column, or if the column initially has
only two inputs, then an adder (preferably a half adder HA)
must be used for adding these two inputs if both of the
following apply: (1) the adjacent less significant column
will produce a carry into this column, and (2) this
two-input addition is needed to achieve a 3-to-2 reduction
(rounded down to the nearest integral ratio) of the number
of inputs in this column. Any inputs not added at a level
are passed on to the next level.
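
The rules just stated can be sketched as a column-height calculation. This is a minimal sketch under stated assumptions, not the circuit of Fig. 3 itself: the names `target_height` and `reduce_level` are hypothetical, and the "3-to-2 reduction (rounded down)" is read here as the height sequence 8, 6, 4, 3, 2 that the description uses for levels L1 to L4.

```python
def target_height(max_height: int) -> int:
    """Largest allowed column height for the next level (assumed sequence 2, 3, 4, 6, 9, ...)."""
    t = 2
    while t * 3 // 2 < max_height:
        t = t * 3 // 2
    return t


def reduce_level(heights):
    """Apply the rules to one level; return (next level heights, columns given a half adder)."""
    target = target_height(max(heights))
    next_heights, half_adder_cols = [], []
    carries_in = 0                      # carries produced by the adjacent less significant column
    for col, h in enumerate(heights):
        full, rest = divmod(h, 3)       # full adders take every three-input group
        needs_half = (rest == 2
                      and carries_in > 0                       # requirement (1)
                      and full + rest + carries_in > target)   # requirement (2)
        if needs_half:
            half_adder_cols.append(col)
        kept = 1 if needs_half else rest   # unadded inputs pass to the next level
        next_heights.append(full + kept + carries_in)
        carries_in = full + (1 if needs_half else 0)
    return next_heights, half_adder_cols


heights = [1, 2, 3, 4, 5, 6, 7, 8, 7, 6, 5, 4, 3, 2, 1, 0]   # 8-bit by 8-bit case
plan = []
while max(heights) > 2:
    heights, halves = reduce_level(heights)
    plan.append(halves)
print(plan)      # [[], [8], [], [6, 7]]
print(heights)   # [1, 2, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 0]
```

Under these assumptions the sketch places half adders only in column 8 of the second level and columns 6 and 7 of the fourth level, and it leaves no more than two inputs in any column after four levels.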

Before describing how the above rules are applied in the
preferred multiplier circuit of Fig. 3, attention is
directed to Figs. 4 and 5, which respectively illustrate
conventional implementations for the full adders FA and
half adders HA in Fig. 3. In these figures, each gate 6
performs an AND function, gate 7 performs an OR function
and each gate 8 performs an exclusive OR function. Note
that the full adder FA is slower than the half adder HA
since it requires two gate levels compared to one for the
half adder HA. Also note that the full adder FA requires
significantly more chip area than does the half adder HA.
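
For illustration, a conventional half adder and full adder can be modeled with the same three gate types. The exact gate arrangement of Figs. 4 and 5 is not reproduced here; the sketch below assumes one common realization, and the function names are hypothetical.

```python
def half_adder(a: int, b: int):
    """One gate level: XOR gives the sum, AND gives the carry."""
    return a ^ b, a & b


def full_adder(a: int, b: int, c: int):
    """Two gate levels on the sum path: XOR of a and b, then XOR with c."""
    partial = a ^ b                       # first exclusive OR level
    s = partial ^ c                       # second exclusive OR level
    carry = (a & b) | (partial & c)       # two AND gates feeding one OR gate
    return s, carry


# Exhaustive check: sum + 2 * carry equals the number of 1 inputs.
for a in (0, 1):
    for b in (0, 1):
        assert half_adder(a, b) == ((a + b) % 2, (a + b) // 2)
        for c in (0, 1):
            assert full_adder(a, b, c) == ((a + b + c) % 2, (a + b + c) // 2)
```

In this realization input c passes through a single exclusive OR gate on its way to the sum while a and b pass through two, which is one way different inputs of the same adder can come to have different input-to-output delays.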

Reference is next directed to Figs. 6-9, which
schematically illustrate how the above described rules are
applied at each level of the multiplier circuit of Fig. 3.
Fig. 6 illustrates the inputs to level L1, which are the
"ab" partial products shown in Fig. 2.

Fig. 6 also indicates which "ab" inputs to level L1
are added in each column by use of enclosing loops. If the
loop encloses three inputs, a full adder FA (Fig. 4) is
used, and if the loop encloses two inputs, a half adder HA
(Fig. 5) is used. Each loop includes a designation of the
resulting sum and carry of level L1 which are applied as
inputs to the next level L2. For example, adding a2b0, a1b1
and a0b2 in column 2 results in a sum i21 and a carry i21c,
the subscript "c" being used to identify a carry. Note
that sum i21 produces an input in the same column 2 in the
next level L2, while carry i21c produces an input in column
3 of level L2.

Figs. 7, 8, 9 and 10 illustrate the inputs to levels
L2, L3 and L4 and to the final adders 20 and 21,
respectively, and are arranged similarly to
Fig. 6 with loops being provided for a like purpose. Note
that the inputs to level L2 shown in Fig. 7 derived from
adders in level L1 use "i" designations, the inputs to
level L3 shown in Fig. 8 derived from adders in level L2
use "j" designations, the inputs to level L4 shown in Fig.
9 derived from adders in level L3 use "k" designations, and
the inputs to the final adders 20, 21 derived from adders
in level L4 use "l" designations. The first subscript (or
first two subscripts for columns greater than 9) of these
i, j, k and l designations identifies the column from which the
sum or carry was derived. The second subscript is merely
used to distinguish sums and carries derived from the same
column.

The manner in which the above rules are applied at each
level in Fig. 3 will next be considered in detail with
reference to Figs. 6-10.

Inputs to Level L1 (Fig. 6)

It will be seen in Figs. 3 and 6 that each three-input
group in each column of level L1 is added by a full adder
FA to produce a sum and a carry. For example, in column 2
of level L1, a2b0, a1b1 and a0b2 are added by a full adder FA
to produce a sum i21 and a carry i21c. The sum i21 is
applied to level L2 in Fig. 7 in the same corresponding
column 2, while the carry i21c is applied to the next
column 3 in level L2, since that is where it is added.
With regard to the remaining two-input groups in the
columns of level L1, the above rules do not require any of
these to be added in level L1, as will now be explained.

It will be remembered that the above rules state that
a two-input group in a column at a level is added if both
of the following apply: (1) the adjacent less significant
column will produce a carry into this column, and (2)
addition is needed to achieve a 3-to-2 reduction (rounded
down to the nearest integer ratio) of the inputs in this
column.

Only columns 1, 4, 7, 10 and 13 in level L1 have a
two-input group for which an add decision has to be made. The
two inputs in column 1 meet neither requirement (1) nor (2)
of the rule. The two-input groups in the remaining columns
4, 7, 10 and 13 also are not required to be added. Although
they meet requirement (1), they do not meet requirement (2).
This will be understood by noting that the maximum number of
inputs applied to level L1 is eight (in column 7). Thus,
in order to achieve a 3-to-2 reduction, the maximum number
of inputs applied to level L2 in any column must be six or
less (since an 8-to-6 reduction meets the 3-to-2 reduction
requirement). The fact that addition of none of the
two-input groups in level L1 is required to meet the rules
will become evident by noting that all of the columns in
level L2 in Fig. 7 have six or fewer inputs even though none
of the two-input groups are added in level L1.

Inputs to Level L2 (Fig. 7)

In level L2 in Fig. 7, only columns 1, 6, 8, 10 and 12
require decisions to be made with regard to adding a
two-input group. Column 1 need not be added since it does not
meet either requirement (1) or (2) of the rules. The
two-input groups in columns 6, 10 and 12 also need not be
added. This will be evident from Fig. 8, which shows that,
even though these two-input groups are not added, none of
the affected columns in level L3 (Fig. 8) will have inputs
which exceed the maximum of 4 dictated by the required
3-to-2 reduction (6-to-4) for level L2. However, the rule
requires adding of the two-input group a7b1, i81 in column 8
of level L2 (Fig. 7), since five inputs would otherwise
result in column 8 of level L3 (Fig. 8), exceeding the
maximum of 4. By adding a7b1, i81, the column 8 inputs to
level L3 are held to the maximum of 4.

Inputs to Level L3 (Fig. 8)

In level L3, decisions with respect to adding
two-input groups need only be made for columns 1, 5, 11 and 14.
Columns 1 and 14 need not be added since they do not meet
either requirement (1) or (2) of the rules. The two-input
groups in columns 5 and 11 also need not be added.
Although they meet requirement (1), they do not meet
requirement (2). This will be understood by noting that
the required 3-to-2 reduction ratio (rounded down to the
nearest integer) is met for level L3 by providing a 4-to-3
reduction (the nearest rounded-down integer ratio).
This reduction is met for these columns since, even though
these two-input groups are not added, none of the affected
columns in level L4 (Fig. 9) have inputs which exceed the
maximum of 3 dictated by the required 4-to-3 reduction for
level L3.



Inputs to Level L4 (Fig. 9)

In level L4, adding decisions need to be made only for
columns 1, 6, 7, 13 and 14. Columns 1, 13 and 14 need not
be added, since neither requirement (1) nor (2) of the rule
is met for these columns. However, the two-input groups in
columns 6 and 7 need to be added in order to meet the
3-to-2 reduction requirement for level L4; otherwise three
inputs would be applied to the affected columns of final
adders 20 and 21, rather than the no more than two inputs
per column applied to these adders 20 and 21 shown in Fig.
10.

It has thus been explained how additions of inputs are
performed at each level of the preferred multiplier circuit
in accordance with the prescribed rules. Although the
preferred multiplier circuit of Fig. 3 performs only those
two-input additions required by the rules, it is within the
scope of the present invention to allow addition of a
remaining two-input group in a column even though not
required by the rules. For example, this may be done for
routing convenience. Such instances should be limited to
prevent exceeding a desired predetermined maximum chip
area.

The remaining parts of the preferred multiplier
circuit of Fig. 3 to be considered are the final adders 20
and 21, which receive the two rows of inputs shown in
Fig. 10. Note that the least significant product bit p0 is
obtained directly from the partial product a0b0 in column
0 and need not be included in the final addition.

Adder 20 receives the inputs from columns 1-5 (Fig.
10) and is chosen to be simply a conventional serial adder
(without lookahead), such as illustrated in Fig. 11. Adder
21 is chosen to be a conventional carry lookahead adder,
which receives the output carry C5 from serial adder 20 and
also the inputs from columns 6-15 (Fig. 10). This choice
of length for the serial adder 20 in the preferred
multiplier circuit of Fig. 3 is approximately 1/3 of the
total number of columns required to be added in this final
adder stage, which has been found to be a preferred length
for a Wallace-type multiplier. This choice of the two
adders 20 and 21 takes advantage of signal flow through
levels L1-L4 of the multiplier circuit of Fig. 3 which
causes more significant inputs in Fig. 10 to arrive
progressively later than less significant inputs. The
number of stages provided for the serial adder 20 (Fig. 11)
is chosen so that its output carry C5 (Fig. 3) arrives no
later than the arrival of the column 6 inputs applied to
the carry lookahead adder 21. It is most advantageous that
the number of stages for the serial adder 20 be chosen so
that its output carry C5 arrives at approximately the same
time as the arrival of the column 6 inputs. Note that the
serial adder 20 shown in Fig. 11 advantageously employs
half adders, thereby permitting faster generation of the
output carry C5.

The above choice of the adders 20 and 21 in Fig. 3, as
described above, provides a further significant enhancement
in multiplier performance. The serial adder 20, because of
its simplicity, requires relatively less chip area; yet,
its slower addition does not detract from overall adder
speed, since its length is advantageously chosen so that
its output carry C5 is produced at approximately the same
time as the arrival of the column 6 inputs applied to the
carry lookahead adder 21. Also, because the use of the
serial adder 20 permits a smaller carry lookahead adder 21
to be used, it will be faster and require less chip area.
As is well known, the propagation delay of a conventional
carry lookahead adder increases progressively as the number
of bit positions it is required to handle increases.
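
The length selection described above can be sketched as a simple timing calculation. Everything numeric here is an assumption made for the sketch: the arrival times, the per-stage delay and the function name `serial_adder_length` are hypothetical, and the model takes the carry out of each serial stage to be ready one stage delay after the later of its column inputs and the incoming carry.

```python
def serial_adder_length(arrival, stage_delay):
    """Return the largest number of low-order columns whose serial carry-out
    arrives no later than the inputs of the next column.

    arrival[k] is the (assumed) arrival time of the inputs of final-adder column k + 1.
    """
    carry_time = 0
    best = 0
    for k in range(len(arrival) - 1):
        # Carry out of this stage waits for its inputs and for the incoming carry.
        carry_time = max(carry_time, arrival[k]) + stage_delay
        if carry_time <= arrival[k + 1]:
            best = k + 1
    return best


# Hypothetical arrival times (arbitrary integer units) for columns 1 to 15.
arrival = [20, 30, 40, 50, 60, 90, 95, 100, 100, 100, 100, 100, 100, 100, 100]
print(serial_adder_length(arrival, stage_delay=14))   # 5
```

With these made-up numbers the carry out of the fifth stage is ready at time 90, exactly when the column 6 inputs arrive, so a five-column serial adder followed by a ten-column carry lookahead adder is selected. Real arrival times would come from calculation, simulation or testing.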

A further enhancement in the operating speed of the
multiplier circuit of Fig. 3 is provided in accordance with
the invention by taking advantage of the different arrival
times of the inputs to each level and the fact that
input-to-output delays through the adders may be different for
different adders as well as being different for different
inputs of the same adder. The present invention takes
advantage of these differences to provide additional
control for achieving a desired balance of multiplier speed
and required chip area without having to change the number
and type of adders provided in accordance with the
previously described rules. An example of how this
approach may be implemented in the preferred embodiment of
Fig. 3 will next be presented.

Assume that it is desired to enhance the speed of
reduction for a particular column of the multiplier circuit
of Fig. 3. In a preferred implementation, the following
rules are used for applying inputs and adders for this
particular column at each adder level:

(1) If one or more inputs in this particular column
are not to be added, select the slowest arriving inputs as
these not-to-be-added inputs and apply them to the next
level without addition.

(2) If there is a half adder in this particular
column, then apply the next slowest arriving input to a
first input of the half adder which provides the smallest
delay to the adder sum output. Apply to the second half
adder input one of the remaining arriving inputs in this
particular column chosen such that the resulting half adder
sum output is caused to arrive due to this input no later
than that due to the input applied to the first half adder
input. Preferably, the arriving input applied to this
second half adder input is chosen so that the resulting
half adder sum arrival due to this input most closely
approximates that due to the input applied to the first
half adder input.

(3) Apply the remaining next slowest arriving input
to a first input of a full adder which provides the
shortest delay to the full adder sum output. Apply to the
second and third full adder inputs particular ones of the
remaining arriving inputs such that, to the extent possible,
the resulting full adder sum output is caused to arrive due
to each of these inputs no later than due to the input
applied to the first full adder input. Preferably, the
inputs applied to these second and third full adder inputs
are chosen so that the resulting full adder sum output
arrival due to each approximates that due to the input
applied to the first full adder input.

(4) Repeat (3) above for each of the remaining full
adders in this particular column.

Alternatively, the above rule (3) can be modified to 
begin with the fastest arriving signal. In such case, rule 
(3) would be replaced by the alternate rule (3)': 

(3)' Apply the fastest arriving input in this
particular column to a first input of a full adder which
provides the longest delay to the adder sum output. Apply
to the second and third full adder inputs particular ones
of the remaining arriving inputs such that, to the extent
possible, the resulting full adder sum output arrival due
to each of these inputs is no sooner than due to the input
applied to the first full adder input. Preferably, the
inputs applied to the second and third full adder inputs
are chosen so that the resulting full adder sum output
arrival due to each approximates that due to the input
applied to the first full adder input.

Next to be described is an example which demonstrates
how the above rules (1), (2), (3) and (4) may be applied to
the multiplier circuit of Fig. 3. It will be understood
that input-to-output adder delays can be determined from
information provided by the manufacturer or by testing.
Arrival times of inputs at each level can be determined
from calculations, simulations or testing. For this
example, it will be illustrated how the speed of addition
may be enhanced for column 8 of level L2 (Fig. 7), which
contains the five inputs a7b1, i81, i82, i71c and i72c.

Initially, note in Fig. 7 that the first described
rules applied to column 8 of level L2 resulted in applying
three of the five inputs (i82, i71c and i72c) to a full adder
and the remaining two inputs (a7b1 and i81) to a half adder.
However, this application of the inputs to the adders did
not take into account the arrival times of the inputs or
the different input-to-output delays provided by the
adders. When this is done in accordance with the second
described rules above, the column 8, level L2 inputs are
applied to the adders in a different manner, as illustrated
in Fig. 12, wherein the half adder is labeled 32 and the
full adder is labeled 34. The half adder inputs are
labeled 32a and 32b and the half adder sum and carry
outputs are labeled 32s and 32c respectively. The full
adder inputs are labeled 34a, 34b and 34c and the full
adder sum and carry outputs are labeled 34s and 34c
respectively.

The manner in which the illustrated Fig. 12
connections are provided in accordance with the second
described rules is set forth below:

Rule (1)

Since there are no inputs in column 8 of level L2
which are not to be added, this rule is skipped.

Rule (2)

Since a half adder 32 (Fig. 12) is provided in column
8 of level L2, rule (2) requires that the slowest arriving
input be applied to the fastest half adder input, which
will be assumed to be 32a. The slowest arriving inputs in
column 8 of level L2 are likely to be inputs i71c and i72c.
The reason is that these inputs i71c and i72c are derived from
slower arriving carry adder outputs from level L1. Also,
since the "ab" inputs to level L1 typically arrive at
approximately the same time from multiplier 10 in Fig. 3,
i71c and i72c can be expected to arrive at approximately the
same time; thus, either may be chosen as the slowest
arriving input in column 8 of level L2. For this example,
i71c is selected to be applied to the fastest half adder
input 32a, as shown in Fig. 12. Also in accordance with
rule (2), input i81 in column 8 of level L2 is applied to
the other half adder input 32b since it is assumed for this
example that it produces a half adder sum arrival which
most closely approximates that due to i71c applied to half
adder input 32a.

Rule (3)

The remaining three inputs in column 8 at level L2 are
a7b1, i82 and i72c, which are applied to the full adder 34 in
Fig. 12. For this example, it is assumed that 34a is the
fastest full adder input, that 34b is the next fastest full
adder input, and that 34c is the slowest full adder input.
In accordance with rule (3), the next slowest input in
column 8 of level L2 is applied to the fastest full adder
input 34a. This next slowest signal is input i72c (since it
is a carry derived from level L1 as explained above), and
thus is applied to the fastest full adder input 34a. The
next slowest arriving signal in column 8 of level L2 is
obviously i82, since the other remaining input a7b1 arrives
much sooner as a result of having passed directly through
level L1 without addition. Thus, as shown in Fig. 12, i82
is applied to full adder input 34b, while a7b1 is applied to
the slowest full adder input 34c.
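
The column 8 example can be sketched in Python as follows. This is a simplified illustration covering only rules (2) and (3) for one half adder and one full adder; the function name `assign_column_inputs` and all arrival times and pin delays are hypothetical values chosen so that the assumptions stated in the example hold (the two carries arrive last and together, a7b1 arrives first, and 32a and 34a are the fastest inputs).

```python
def assign_column_inputs(arrival, half_pins, full_pins):
    """arrival: {input name: arrival time}; *_pins: {pin name: delay from pin to sum output}."""
    remaining = dict(arrival)
    wiring = {}

    # Rule (2): the slowest input goes to the fastest half adder pin.
    fast_pin, other_pin = sorted(half_pins, key=half_pins.get)
    first = max(remaining, key=remaining.get)
    wiring[fast_pin] = first
    limit = remaining.pop(first) + half_pins[fast_pin]

    # Second half adder input: latest sum arrival that does not exceed `limit`.
    def sum_time(name):
        return remaining[name] + half_pins[other_pin]

    ok = [name for name in remaining if sum_time(name) <= limit]
    second = max(ok, key=sum_time) if ok else min(remaining, key=sum_time)
    wiring[other_pin] = second
    del remaining[second]

    # Rule (3): remaining inputs, slowest first, onto full adder pins, fastest first.
    by_slowest = sorted(remaining, key=remaining.get, reverse=True)
    for pin, name in zip(sorted(full_pins, key=full_pins.get), by_slowest):
        wiring[pin] = name
    return wiring


# Hypothetical arrival times and pin delays (arbitrary integer units).
arrival = {"a7b1": 0, "i81": 32, "i82": 30, "i71c": 40, "i72c": 40}
half_pins = {"32a": 10, "32b": 15}
full_pins = {"34a": 20, "34b": 25, "34c": 30}
print(assign_column_inputs(arrival, half_pins, full_pins))
# {'32a': 'i71c', '32b': 'i81', '34a': 'i72c', '34b': 'i82', '34c': 'a7b1'}
```

With these assumed numbers the sketch reproduces the connections described for Fig. 12. Different measured arrival times or pin delays would generally lead to a different wiring.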

Although the above example has been limited to
demonstrating how the addition speed of the inputs in
column 8 of level L2 may be enhanced, it is to be
understood that other levels of column 8 as well as other
columns of the multiplier circuit of Fig. 3 could have
their speed enhanced in a like manner. It is also to be
understood that, typically, only particular columns are
speed sensitive so that this speed enhancement approach may
be used in a multiplier selectively for one or more
particular columns.

WO 94/12928 PCT/US93/11196

It is further to be understood that since a speed
sensitive column typically contains one or more inputs
derived from the adjacent less significant column, the
speed of addition for a speed sensitive column may be
further enhanced by choosing connections in this adjacent
less significant column to take advantage of arrival times
and adder delays so that carry inputs in the adjacent speed
sensitive column are produced faster.

From a global viewpoint, still further speed
enhancements are possible by taking into account how input
adder connections for all columns for a plurality of levels
affect overall multiplier speed, and then taking advantage
of arrival times and adder delays such that the fastest
overall multiplier speed is obtained.

Although the present invention has been described with
respect to particular preferred embodiments, it is to be
understood that the present invention is not limited to the
preferred embodiments, since many variations in
construction, arrangement, use and operation are possible
within the scope of the invention. For example, the
invention can readily be adapted for use when the original
multiplier and multiplicand are represented in two's
complement binary encoded format. Also, the invention may
be adapted for use with other types of multipliers, such as
a Booth multiplier.

Accordingly, the present invention is to be considered
as encompassing all modifications, variations and
adaptations coming within the scope of the invention as
defined by the appended claims.

What is Claimed is:

1. In a binary multiplier circuit for multiplying an
n-bit multiplicand by an m-bit multiplier to produce a
multiple-bit product, the combination comprising:

an initial binary multiplier to which signals
representing the bits of said multiplier and multiplicand
are applied for producing m×n partial product signals, each
partial product signal being in a respective column and
each column corresponding to a respective product bit; and

adder circuit means to which said partial product
signals are applied, said adder circuit means comprising a
plurality of adder levels for successively reducing the
number of column inputs until a final set of inputs is
produced having no more than two inputs remaining to be
added in any column;

said adder circuit means performing addition at each
level in accordance with the following rules:

a) for each column at each level, three-input groups 
are added until less than three inputs remain in the column 
at that level; 

b) if only two inputs remain in a column after
performing a) or if the column originally has only two
inputs, then these two inputs must be added when both of
the following apply:

(1) the adjacent less significant column will produce
a carry into this column, and

(2) this two-input addition is needed to achieve a
3-to-2 reduction (rounded down to the nearest integral
ratio) for that level.

2. The combination of claim 1, wherein each adder level 
adds each three-input group using a full adder and adds 
each two-input group using a half adder. 

3. The combination of claim 1 or 2, wherein a sum
produced in a column at an adder level provides an input in
the same column of the next level while a carry provides an
input in the next more significant column of the next
level.

4. The combination of claim 3, including final adder 
means for adding predetermined ones of said final set of 
column inputs to produce signals representing the bits of 
said product. 

5. The combination of claim 4, wherein said final adder
means comprises a serial adder for adding inputs in a first
predetermined number of columns of said final set and a
carry lookahead adder for adding inputs in a second
predetermined number of columns of said final set.

6. The combination of claim 5, wherein said first
predetermined number of columns are of less significance
than said second predetermined number of columns, and
wherein said serial adder produces an output carry which is
applied as an input to said carry lookahead adder.

7. The combination of claim 6, wherein at least the least 
significant column of said final set contains a single 
input which is used as the least significant product bit 
without being applied to said serial adder. 

8. The combination of claim 6, wherein said first
predetermined number of columns is chosen to be
approximately one-third of the total number of columns of
said final set of inputs.

9. The combination of claim 6, wherein said first
predetermined number of columns is chosen so that said
output carry arrives at said carry lookahead adder at
approximately the same time as the arrival of the inputs in
the least significant column applied to said lookahead
adder.

10. The combination of claim 6, wherein said serial adder
comprises a plurality of half adder stages.



11. In a binary multiplier circuit for multiplying an 
n-bit multiplicand by an m-bit multiplier to produce a 
multiple-bit product^ the combination comprising: 

an initial binary multiplier to which signals
representing the bits of said multiplier and multiplicand
are applied for producing m+n partial product signals, each
partial product signal being in a respective column and
each column corresponding to a respective product bit; and

adder circuit means to which said partial product
signals are applied, said adder circuit means comprising a
plurality of adder levels for successively reducing the
number of column inputs until a final set of inputs are
produced having no more than two inputs remaining to be
added in any column, said adder levels comprising adders
providing different adder delays between adder inputs and
outputs;

said adder circuit means performing addition at each 
level in accordance with the following rules: 

a) for each column at each level, three-input groups
are added until less than three inputs remain in the column
at that level;

b) if only two inputs remain in a column after
performing a) or if the column originally has only two
inputs, then these two inputs must be added when both of
the following apply:

(1) the adjacent less significant column will produce
a carry into this column, and

(2) this two-input addition is needed to achieve a
3-to-2 reduction (rounded-down to the nearest integral
ratio) for that level,

wherein arriving inputs are connected for a particular
column at a particular level in accordance with the
following additional rules:

c) if one or more inputs in said particular column
are not to be added at said particular level, select the
slowest arriving inputs as these not-to-be-added inputs and
apply them to the next level without addition;

d) if there is a half adder in said particular column
at said particular level, apply the next slowest arriving
input to a first half adder input which provides the
smallest delay to the adder sum output, and apply to a
second half adder input a remaining arriving input chosen
such that the resulting half adder sum output is caused to
arrive due to this input no later than due to the input
applied to said first half adder input;

e) apply the remaining next slowest arriving input to
a first full adder input which provides the shortest delay
to the full adder sum output, and apply to second and third
full adder inputs particular ones of the remaining arriving
inputs such that to the extent possible the resulting full
adder sum output is caused to arrive due to each of these
inputs no later than due to the input applied to said first
full adder input; and

f) repeat c) above for each remaining adder in said
particular column at said particular level.
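
Rules c) through f) amount to a greedy, arrival-time-driven assignment of inputs to adder pins within one column. The following Python sketch illustrates one possible reading. The pin delays, the names `pick_within` and `assign_column`, and the tie-breaking choice (take the latest-arriving input that still meets the deadline, otherwise the earliest remaining one) are assumptions made for illustration; the claim itself only requires that the chosen inputs not delay the sum output, to the extent possible.

```python
def pick_within(remaining, deadline, pin_delay):
    """Remove and return the latest-arriving input whose contribution to the
    sum output meets `deadline`. `remaining` is sorted slowest first.
    Assumption: if none qualifies, fall back to the earliest arrival."""
    for i, arrival in enumerate(remaining):
        if arrival + pin_delay <= deadline:
            return remaining.pop(i)
    return remaining.pop()


def assign_column(arrivals, n_full, n_half,
                  fa_pin_delays=(1.0, 2.0, 2.0),   # hypothetical pin-to-sum delays
                  ha_pin_delays=(1.0, 1.5)):       # hypothetical pin-to-sum delays
    """Return (passed_through, adders) for one column at one level.
    Each adder entry is (kind, input_arrivals_by_pin, sum_ready_time)."""
    remaining = sorted(arrivals, reverse=True)      # slowest arrival first
    n_pass = len(remaining) - 3 * n_full - 2 * n_half
    if n_pass < 0:
        raise ValueError("not enough inputs for the requested adders")
    # Rule c): the slowest inputs skip this level.
    passed, remaining = remaining[:n_pass], remaining[n_pass:]
    adders = []
    # Rule d) handles half adders, rule e) full adders, rule f) repeats.
    for kind, count, delays in (("HA", n_half, ha_pin_delays),
                                ("FA", n_full, fa_pin_delays)):
        pins = sorted(delays)                       # fastest pin first
        for _ in range(count):
            first = remaining.pop(0)                # next slowest input
            deadline = first + pins[0]              # sum time due to that input
            others = [pick_within(remaining, deadline, d) for d in pins[1:]]
            inputs = [first] + others
            sum_ready = max(t + d for t, d in zip(inputs, pins))
            adders.append((kind, inputs, sum_ready))
    return passed, adders


passed, adders = assign_column([0, 0, 1, 3, 4, 5], n_full=1, n_half=1)
print(passed, adders)
```

The point of the ordering is that a late input meets a short path while early inputs absorb the longer paths, so the sum output is not made later than the late input alone would make it.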

12. The combination of claim 11, wherein an adder also
produces an adder carry output, and wherein an adder sum
output produced in a column at an adder level provides an
input in the same column of the next level while an adder
carry sum provides an input in the next more significant
column of the next level.
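
Rules a) and b) of claim 11, together with the sum and carry routing of claim 12, can be modeled behaviorally. The Python sketch below is an illustration under assumptions, not the claimed circuit: the function names are invented here, and the per-level target height `2 * (height // 3) + height % 3` is one assumed reading of the "3-to-2 reduction (rounded-down to the nearest integral ratio)" in rule b)(2). Adder delays and rules c) through f) are ignored.

```python
def partial_products(a, b, n, m):
    """Column j holds every AND term a_i & b_k with i + k == j."""
    cols = [[] for _ in range(n + m)]
    for i in range(n):
        for k in range(m):
            cols[i + k].append((a >> i) & (b >> k) & 1)
    return cols


def reduce_level(cols):
    """Apply one adder level and return the columns of the next level."""
    height = max(len(col) for col in cols)
    target = 2 * (height // 3) + height % 3  # assumed reading of rule b)(2)
    nxt = [[] for _ in cols]
    carries_in = 0  # carries produced by the adjacent less significant column
    for j, col in enumerate(cols):
        bits, sums, carries = list(col), [], []
        while len(bits) >= 3:  # rule a): full adders on three-input groups
            x, y, z = bits.pop(), bits.pop(), bits.pop()
            sums.append(x ^ y ^ z)
            carries.append((x & y) | (y & z) | (x & z))
        # rule b): half adder only if a carry arrives and the column would
        # otherwise exceed the target height at the next level
        if len(bits) == 2 and carries_in and len(sums) + 2 + carries_in > target:
            x, y = bits.pop(), bits.pop()
            sums.append(x ^ y)
            carries.append(x & y)
        nxt[j].extend(sums + bits)       # sums stay in the same column
        if j + 1 < len(cols):
            nxt[j + 1].extend(carries)   # carries move one column up
        # A carry out of the top column is always 0, since the product
        # fits in n + m bits, so dropping it loses nothing.
        carries_in = len(carries)
    return nxt


def reduce_to_two_rows(a, b, n=8, m=8):
    """Reduce until no column holds more than two inputs."""
    cols, levels = partial_products(a, b, n, m), 0
    while max(len(col) for col in cols) > 2:
        cols, levels = reduce_level(cols), levels + 1
    return cols, levels


def column_value(cols):
    """Weighted sum of all remaining column inputs."""
    return sum(bit << j for j, col in enumerate(cols) for bit in col)


cols, levels = reduce_to_two_rows(201, 173)
assert column_value(cols) == 201 * 173  # reduction preserves the product
```

Each level preserves the weighted sum of the column inputs, so the final set of at most two inputs per column still represents the product and can be handed to a final adder.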

13. The combination of claim 12, including final adder 
means for adding predetermined ones of said final set of 
column inputs to produce signals representing the bits of 
said product. 

14. The combination of claim 13, wherein said final adder
means comprises a serial adder for adding inputs in a first
predetermined number of columns of said final set and a
carry lookahead adder for adding inputs in a second
predetermined number of columns of said final set.

15. The combination of claim 14, wherein said first
predetermined number of columns are of less significance
than said second predetermined number of columns, and
wherein said serial adder produces an output carry which is
applied as an input to said carry lookahead adder.

16. The combination of claim 15, wherein at least the
least significant column of said final set contains a 
single input which is used as the least significant product 
bit without being applied to said serial adder. 

17. The combination of claim 15, wherein said first
predetermined number of columns is chosen to be 
approximately one-third of the total number of columns of 
said final set of inputs. 

18. The combination of claim 15, wherein said first
predetermined number of columns is chosen so that said
output carry arrives at said carry lookahead adder at
approximately the same time as the arrival of the inputs in
the least significant column applied to said lookahead
adder.



19. The combination of claim 15, wherein said serial adder
comprises a plurality of half adder stages.